fix: refresh reused sandbox skills after local skill updates#6085
fix: refresh reused sandbox skills after local skill updates#6085stablegenius49 wants to merge 1 commit intoAstrBotDevs:masterfrom
Conversation
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request addresses an issue where reused sandbox sessions might operate with outdated local skills. By introducing a robust skill revision tracking system, the changes ensure that any modifications to local skills are detected and automatically synchronized to the sandbox before it's reused, thereby maintaining consistency and preventing unexpected behavior due to stale skill sets. Highlights
Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Hey - 我发现了 1 个问题,并留下了一些整体性的反馈:
- 在
_compute_local_skills_revision中,建议考虑处理文件系统竞争条件(例如,在rglob和stat之间文件被删除),以避免一次短暂的FileNotFoundError就导致整个修订版本计算失败。 - 对于缺失的 skills 目录,目前修订版本语义不一致:
_compute_local_skills_revision返回"missing",而在 skills 缺失时创建的 booter 从不会设置修订版本。这会导致每次复用时都进行不必要的重新同步。更干净的做法是在_sync_skills_to_sandbox的提前返回路径中也设置一个修订版本,或者统一处理None和"missing"。
给 AI Agent 的提示词
Please address the comments from this code review:
## Overall Comments
- In `_compute_local_skills_revision`, consider handling filesystem races (e.g., files deleted between `rglob` and `stat`) so a transient `FileNotFoundError` doesn’t cause the whole revision computation to fail.
- The revision semantics for a missing skills directory are inconsistent (`_compute_local_skills_revision` returns 'missing' while booters created when skills are absent never get a revision set), which will cause a needless resync attempt on every reuse; it would be cleaner to set a revision even on the early-return path in `_sync_skills_to_sandbox` or normalize `None` vs 'missing'.
## Individual Comments
### Comment 1
<location path="tests/test_computer_skill_sync.py" line_range="77-86" />
<code_context>
+def test_sync_skills_sets_revision_after_success(monkeypatch, tmp_path: Path):
</code_context>
<issue_to_address>
**suggestion (testing):** Strengthen the revision assertion to check the exact expected value rather than just existence
This test currently only checks that `_astrbot_skills_revision` is truthy. Since `_compute_local_skills_revision` is deterministic for a given tree, it would be more robust to assert the exact revision value, e.g. by:
- Computing the expected revision via `computer_client._compute_local_skills_revision(Path(skills_root))` and comparing it to `booter._astrbot_skills_revision`, or
- Patching `_compute_local_skills_revision` to return a known sentinel (e.g. `"rev-123"`) and asserting the attribute matches it.
That way the test verifies that `_sync_skills_to_sandbox` propagates the correct revision, not just a non-empty value.
Suggested implementation:
```python
def test_sync_skills_sets_revision_after_success(monkeypatch, tmp_path: Path):
skills_root = tmp_path / "skills"
temp_root = tmp_path / "temp"
skill_dir = skills_root / "custom-agent-skill"
skill_dir.mkdir(parents=True, exist_ok=True)
skill_dir.joinpath("SKILL.md").write_text("# demo", encoding="utf-8")
temp_root.mkdir(parents=True, exist_ok=True)
monkeypatch.setattr(
"astrbot.core.computer.computer_client.get_astrbot_skills_path",
lambda: str(skills_root),
```
```python
expected_revision = computer_client._compute_local_skills_revision(skills_root)
assert booter._astrbot_skills_revision == expected_revision
```
I’ve assumed the existing assertion is a simple truthiness check like `assert booter._astrbot_skills_revision` and that the test has `computer_client` and `booter` variables available in scope after invoking `_sync_skills_to_sandbox`. If the variable names differ (e.g. `client` instead of `computer_client`, or `skill_booter` instead of `booter`), adjust the `expected_revision = ...` line accordingly to use the correct instance. Also ensure the call to `_compute_local_skills_revision` uses the same type used elsewhere (e.g. `skills_root` as a `Path`; if the method expects a string, wrap with `str(skills_root)`).
</issue_to_address>帮我变得更有用!请在每条评论上点 👍 或 👎,我会根据你的反馈改进后续的评审。
Original comment in English
Hey - I've found 1 issue, and left some high level feedback:
- In
_compute_local_skills_revision, consider handling filesystem races (e.g., files deleted betweenrglobandstat) so a transientFileNotFoundErrordoesn’t cause the whole revision computation to fail. - The revision semantics for a missing skills directory are inconsistent (
_compute_local_skills_revisionreturns 'missing' while booters created when skills are absent never get a revision set), which will cause a needless resync attempt on every reuse; it would be cleaner to set a revision even on the early-return path in_sync_skills_to_sandboxor normalizeNonevs 'missing'.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In `_compute_local_skills_revision`, consider handling filesystem races (e.g., files deleted between `rglob` and `stat`) so a transient `FileNotFoundError` doesn’t cause the whole revision computation to fail.
- The revision semantics for a missing skills directory are inconsistent (`_compute_local_skills_revision` returns 'missing' while booters created when skills are absent never get a revision set), which will cause a needless resync attempt on every reuse; it would be cleaner to set a revision even on the early-return path in `_sync_skills_to_sandbox` or normalize `None` vs 'missing'.
## Individual Comments
### Comment 1
<location path="tests/test_computer_skill_sync.py" line_range="77-86" />
<code_context>
+def test_sync_skills_sets_revision_after_success(monkeypatch, tmp_path: Path):
</code_context>
<issue_to_address>
**suggestion (testing):** Strengthen the revision assertion to check the exact expected value rather than just existence
This test currently only checks that `_astrbot_skills_revision` is truthy. Since `_compute_local_skills_revision` is deterministic for a given tree, it would be more robust to assert the exact revision value, e.g. by:
- Computing the expected revision via `computer_client._compute_local_skills_revision(Path(skills_root))` and comparing it to `booter._astrbot_skills_revision`, or
- Patching `_compute_local_skills_revision` to return a known sentinel (e.g. `"rev-123"`) and asserting the attribute matches it.
That way the test verifies that `_sync_skills_to_sandbox` propagates the correct revision, not just a non-empty value.
Suggested implementation:
```python
def test_sync_skills_sets_revision_after_success(monkeypatch, tmp_path: Path):
skills_root = tmp_path / "skills"
temp_root = tmp_path / "temp"
skill_dir = skills_root / "custom-agent-skill"
skill_dir.mkdir(parents=True, exist_ok=True)
skill_dir.joinpath("SKILL.md").write_text("# demo", encoding="utf-8")
temp_root.mkdir(parents=True, exist_ok=True)
monkeypatch.setattr(
"astrbot.core.computer.computer_client.get_astrbot_skills_path",
lambda: str(skills_root),
```
```python
expected_revision = computer_client._compute_local_skills_revision(skills_root)
assert booter._astrbot_skills_revision == expected_revision
```
I’ve assumed the existing assertion is a simple truthiness check like `assert booter._astrbot_skills_revision` and that the test has `computer_client` and `booter` variables available in scope after invoking `_sync_skills_to_sandbox`. If the variable names differ (e.g. `client` instead of `computer_client`, or `skill_booter` instead of `booter`), adjust the `expected_revision = ...` line accordingly to use the correct instance. Also ensure the call to `_compute_local_skills_revision` uses the same type used elsewhere (e.g. `skills_root` as a `Path`; if the method expects a string, wrap with `str(skills_root)`).
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| def test_sync_skills_sets_revision_after_success(monkeypatch, tmp_path: Path): | ||
| skills_root = tmp_path / "skills" | ||
| temp_root = tmp_path / "temp" | ||
| skill_dir = skills_root / "custom-agent-skill" | ||
| skill_dir.mkdir(parents=True, exist_ok=True) | ||
| skill_dir.joinpath("SKILL.md").write_text("# demo", encoding="utf-8") | ||
| temp_root.mkdir(parents=True, exist_ok=True) | ||
|
|
||
| monkeypatch.setattr( | ||
| "astrbot.core.computer.computer_client.get_astrbot_skills_path", |
There was a problem hiding this comment.
suggestion (testing): 将修订版本的断言加强为检查精确的预期值,而不仅仅是检查其是否存在
当前这个测试只验证 _astrbot_skills_revision 是否为 truthy。由于 _compute_local_skills_revision 对于给定的目录树是确定性的,更稳健的做法是断言精确的修订版本值,例如:
- 通过
computer_client._compute_local_skills_revision(Path(skills_root))计算期望的修订版本,并与booter._astrbot_skills_revision比较;或者 - 将
_compute_local_skills_revision打补丁为返回一个已知的哨兵值(例如"rev-123"),然后断言该属性与之相等。
这样测试可以验证 _sync_skills_to_sandbox 传播的是正确的修订版本,而不仅仅是一个非空值。
建议的实现:
def test_sync_skills_sets_revision_after_success(monkeypatch, tmp_path: Path):
skills_root = tmp_path / "skills"
temp_root = tmp_path / "temp"
skill_dir = skills_root / "custom-agent-skill"
skill_dir.mkdir(parents=True, exist_ok=True)
skill_dir.joinpath("SKILL.md").write_text("# demo", encoding="utf-8")
temp_root.mkdir(parents=True, exist_ok=True)
monkeypatch.setattr(
"astrbot.core.computer.computer_client.get_astrbot_skills_path",
lambda: str(skills_root), expected_revision = computer_client._compute_local_skills_revision(skills_root)
assert booter._astrbot_skills_revision == expected_revision我假设当前的断言只是一个简单的 truthy 检查,例如 assert booter._astrbot_skills_revision,并且在调用 _sync_skills_to_sandbox 之后,测试中可以在作用域中访问到 computer_client 和 booter 变量。如果变量名不同(例如使用 client 而不是 computer_client,或使用 skill_booter 而不是 booter),请相应调整 expected_revision = ... 这一行来使用正确的实例。另请确保调用 _compute_local_skills_revision 时使用与其他地方一致的类型(例如使用 Path 类型的 skills_root;如果该方法期望字符串类型,则用 str(skills_root) 包装)。
Original comment in English
suggestion (testing): Strengthen the revision assertion to check the exact expected value rather than just existence
This test currently only checks that _astrbot_skills_revision is truthy. Since _compute_local_skills_revision is deterministic for a given tree, it would be more robust to assert the exact revision value, e.g. by:
- Computing the expected revision via
computer_client._compute_local_skills_revision(Path(skills_root))and comparing it tobooter._astrbot_skills_revision, or - Patching
_compute_local_skills_revisionto return a known sentinel (e.g."rev-123") and asserting the attribute matches it.
That way the test verifies that _sync_skills_to_sandbox propagates the correct revision, not just a non-empty value.
Suggested implementation:
def test_sync_skills_sets_revision_after_success(monkeypatch, tmp_path: Path):
skills_root = tmp_path / "skills"
temp_root = tmp_path / "temp"
skill_dir = skills_root / "custom-agent-skill"
skill_dir.mkdir(parents=True, exist_ok=True)
skill_dir.joinpath("SKILL.md").write_text("# demo", encoding="utf-8")
temp_root.mkdir(parents=True, exist_ok=True)
monkeypatch.setattr(
"astrbot.core.computer.computer_client.get_astrbot_skills_path",
lambda: str(skills_root), expected_revision = computer_client._compute_local_skills_revision(skills_root)
assert booter._astrbot_skills_revision == expected_revisionI’ve assumed the existing assertion is a simple truthiness check like assert booter._astrbot_skills_revision and that the test has computer_client and booter variables available in scope after invoking _sync_skills_to_sandbox. If the variable names differ (e.g. client instead of computer_client, or skill_booter instead of booter), adjust the expected_revision = ... line accordingly to use the correct instance. Also ensure the call to _compute_local_skills_revision uses the same type used elsewhere (e.g. skills_root as a Path; if the method expects a string, wrap with str(skills_root)).
There was a problem hiding this comment.
Code Review
This pull request effectively addresses the issue of stale skills in reused sandbox sessions by introducing a revisioning system based on file metadata. The implementation is sound and is accompanied by appropriate regression tests. My review includes a critical fix for a potential hash collision in the revision computation, an improvement to error logging for better observability, and a refactoring suggestion for the new tests to enhance maintainability.
| relative = path.relative_to(skills_root).as_posix() | ||
| digest.update(relative.encode("utf-8")) | ||
| stat = path.stat() | ||
| digest.update(str(stat.st_mtime_ns).encode("utf-8")) | ||
| if path.is_file(): | ||
| digest.update(str(stat.st_size).encode("utf-8")) |
There was a problem hiding this comment.
The current method of updating the hash digest by concatenating file metadata without separators can lead to hash collisions. For example, a file with relative path foo and mtime 12345 would produce the same hash input as a file with path foo1 and mtime 2345, as both result in the byte sequence foo12345 being hashed. This could cause the system to fail to detect changes in skills. Using a separator like a null byte between each piece of metadata will prevent such ambiguities.
| relative = path.relative_to(skills_root).as_posix() | |
| digest.update(relative.encode("utf-8")) | |
| stat = path.stat() | |
| digest.update(str(stat.st_mtime_ns).encode("utf-8")) | |
| if path.is_file(): | |
| digest.update(str(stat.st_size).encode("utf-8")) | |
| relative = path.relative_to(skills_root).as_posix() | |
| stat = path.stat() | |
| digest.update(relative.encode("utf-8")) | |
| digest.update(b"\0") | |
| digest.update(str(stat.st_mtime_ns).encode("utf-8")) | |
| digest.update(b"\0") | |
| if path.is_file(): | |
| digest.update(str(stat.st_size).encode("utf-8")) | |
| digest.update(b"\0") |
| except Exception as e: | ||
| logger.warning( | ||
| "Failed to refresh sandbox skills before reusing session %s: %s", | ||
| session_id, | ||
| e, | ||
| ) |
There was a problem hiding this comment.
Catching a broad Exception is reasonable here to prevent crashes, but logging just str(e) can hide important debugging information like the traceback. Using logger.warning with exc_info=True will automatically include the exception information and traceback in the log, which is much more helpful for diagnosing failures during skill synchronization.
| except Exception as e: | |
| logger.warning( | |
| "Failed to refresh sandbox skills before reusing session %s: %s", | |
| session_id, | |
| e, | |
| ) | |
| except Exception: | |
| logger.warning( | |
| "Failed to refresh sandbox skills before reusing session %s", | |
| session_id, | |
| exc_info=True, | |
| ) |
| async def test_get_booter_refreshes_existing_session_when_skills_change(self): | ||
| """Test get_booter re-syncs local skills before reusing a stale sandbox.""" | ||
| from astrbot.core.computer import computer_client | ||
|
|
||
| computer_client.session_booter.clear() | ||
|
|
||
| mock_context = MagicMock() | ||
| mock_config = MagicMock() | ||
| mock_config.get = lambda key, default=None: { | ||
| "provider_settings": { | ||
| "computer_use_runtime": "sandbox", | ||
| "sandbox": { | ||
| "booter": "shipyard", | ||
| "shipyard_endpoint": "http://localhost:8080", | ||
| "shipyard_access_token": "test_token", | ||
| } | ||
| } | ||
| }.get(key, default) | ||
| mock_context.get_config = MagicMock(return_value=mock_config) | ||
|
|
||
| mock_booter = MagicMock() | ||
| mock_booter.available = AsyncMock(return_value=True) | ||
| mock_booter._astrbot_skills_revision = "rev-old" | ||
|
|
||
| sync_mock = AsyncMock() | ||
| with ( | ||
| patch( | ||
| "astrbot.core.computer.computer_client._compute_local_skills_revision", | ||
| return_value="rev-new", | ||
| ), | ||
| patch( | ||
| "astrbot.core.computer.computer_client._sync_skills_to_sandbox", | ||
| sync_mock, | ||
| ), | ||
| ): | ||
| computer_client.session_booter["test-session"] = mock_booter | ||
|
|
||
| booter = await computer_client.get_booter(mock_context, "test-session") | ||
| assert booter is mock_booter | ||
| sync_mock.assert_awaited_once_with(mock_booter) | ||
|
|
||
| computer_client.session_booter.clear() | ||
|
|
||
| @pytest.mark.asyncio | ||
| async def test_get_booter_skips_resync_when_skills_revision_matches(self): | ||
| """Test get_booter reuses existing booter without redundant syncs.""" | ||
| from astrbot.core.computer import computer_client | ||
|
|
||
| computer_client.session_booter.clear() | ||
|
|
||
| mock_context = MagicMock() | ||
| mock_config = MagicMock() | ||
| mock_config.get = lambda key, default=None: { | ||
| "provider_settings": { | ||
| "computer_use_runtime": "sandbox", | ||
| "sandbox": { | ||
| "booter": "shipyard", | ||
| "shipyard_endpoint": "http://localhost:8080", | ||
| "shipyard_access_token": "test_token", | ||
| } | ||
| } | ||
| }.get(key, default) | ||
| mock_context.get_config = MagicMock(return_value=mock_config) | ||
|
|
||
| mock_booter = MagicMock() | ||
| mock_booter.available = AsyncMock(return_value=True) | ||
| mock_booter._astrbot_skills_revision = "rev-same" | ||
|
|
||
| sync_mock = AsyncMock() | ||
| with ( | ||
| patch( | ||
| "astrbot.core.computer.computer_client._compute_local_skills_revision", | ||
| return_value="rev-same", | ||
| ), | ||
| patch( | ||
| "astrbot.core.computer.computer_client._sync_skills_to_sandbox", | ||
| sync_mock, | ||
| ), | ||
| ): | ||
| computer_client.session_booter["test-session"] = mock_booter | ||
|
|
||
| booter = await computer_client.get_booter(mock_context, "test-session") | ||
| assert booter is mock_booter | ||
| sync_mock.assert_not_awaited() | ||
|
|
||
| computer_client.session_booter.clear() |
There was a problem hiding this comment.
The tests test_get_booter_refreshes_existing_session_when_skills_change and test_get_booter_skips_resync_when_skills_revision_matches are nearly identical, with only the revision values and the expected outcome being different. This code duplication can be eliminated by using pytest.mark.parametrize to create a single, data-driven test. This will make the test suite more concise and easier to maintain or extend in the future.
@pytest.mark.asyncio
@pytest.mark.parametrize(
"old_rev, new_rev, should_sync",
[
("rev-old", "rev-new", True),
("rev-same", "rev-same", False),
],
ids=[
"refreshes_when_skills_change",
"skips_resync_when_skills_revision_matches",
],
)
async def test_get_booter_skill_refresh_logic(self, old_rev, new_rev, should_sync):
"""Test get_booter re-syncs local skills based on revision changes."""
from astrbot.core.computer import computer_client
computer_client.session_booter.clear()
mock_context = MagicMock()
mock_config = MagicMock()
mock_config.get = lambda key, default=None: {
"provider_settings": {
"computer_use_runtime": "sandbox",
"sandbox": {
"booter": "shipyard",
"shipyard_endpoint": "http://localhost:8080",
"shipyard_access_token": "test_token",
},
}
}.get(key, default)
mock_context.get_config = MagicMock(return_value=mock_config)
mock_booter = MagicMock()
mock_booter.available = AsyncMock(return_value=True)
mock_booter._astrbot_skills_revision = old_rev
sync_mock = AsyncMock()
with (
patch(
"astrbot.core.computer.computer_client._compute_local_skills_revision",
return_value=new_rev,
),
patch(
"astrbot.core.computer.computer_client._sync_skills_to_sandbox",
sync_mock,
),
):
computer_client.session_booter["test-session"] = mock_booter
booter = await computer_client.get_booter(mock_context, "test-session")
assert booter is mock_booter
if should_sync:
sync_mock.assert_awaited_once_with(mock_booter)
else:
sync_mock.assert_not_awaited()
computer_client.session_booter.clear()
Summary
Testing
PYTHONPATH=. pytest -q tests/test_computer_skill_sync.py tests/unit/test_computer.py::TestComputerClient::test_get_booter_refreshes_existing_session_when_skills_change tests/unit/test_computer.py::TestComputerClient::test_get_booter_skips_resync_when_skills_revision_matchesCloses #5986.
由 Sourcery 提供的总结
通过在每个 booter 上跟踪修订指纹,并在会话重用时使用它,确保当本地技能发生变化时会刷新沙盒会话。
错误修复:
增强功能:
测试:
Original summary in English
Summary by Sourcery
Ensure sandbox sessions are refreshed when local skills change by tracking a revision fingerprint on each booter and using it during session reuse.
Bug Fixes:
Enhancements:
Tests: